Cross-modal prediction in audio-visual communication
نویسندگان
چکیده
Ram R. Rao Georgia Institute of Technology Atlanta, GA 30332 [email protected] Tsuhan Chen AT&T Bell Laboratories Holmdel, NJ 07733 [email protected] ABSTRACT In this paper, we present a novel means for predicting the shape of a person's mouth from the corresponding speech signal and explore applications of this prediction to video coding. The prediction is accomplished by modeling the probability distribution of the audiovisual features by a Gaussian mixture density. The optimal estimate for the visual features given the acoustic features can then be computed using this probability distribution. The ability to predict a person's mouth shape from the corresponding audio leads to a number of interesting joint audio-video coding strategies. In the cross-modal predictive coding system described in this paper, a model-based video coder compares measured visual parameters with predicted visual parameters, and sends the di erence between the two to the receiver. Since the decoder also receives the acoustic data, it can form the prediction and then reconstruct the original parameters by adding the transmitted error signal.
منابع مشابه
Auditory cross-modal reorganization in cochlear implant users indicates audio-visual integration
There is clear evidence for cross-modal cortical reorganization in the auditory system of post-lingually deafened cochlear implant (CI) users. A recent report suggests that moderate sensori-neural hearing loss is already sufficient to initiate corresponding cortical changes. To what extend these changes are deprivation-induced or related to sensory recovery is still debated. Moreover, the influ...
متن کاملOp-brai150076 1..6
Developmental vision is deemed to be necessary for the maturation of multisensory cortical circuits. Thus far, this has only been investigated in animal studies, which have shown that congenital visual deprivation markedly reduces the capability of neurons to integrate cross-modal inputs. The present study investigated the effect of transient congenital visual deprivation on the neural mechanis...
متن کاملIRWIN AND JOAN JACOBS CENTER FOR COMMUNICATION AND INFORMATION TECHNOLOGIES Onsets Coincidence for Cross- Modal Analysis
Cross-modal analysis offers information beyond one extracted from individual modalities. Consider a non-trivial scene, that includes several moving visual objects, of which some emit sounds. The scene is sensed by a camcorder having a single microphone. A task for audio-visual analysis is to assess the number of independent audio-associated visual objects (AVOs), pinpoint the AVOs’ spatial loca...
متن کاملDetection of auditory (cross-spectral) and auditory-visual (cross-modal) synchrony
Detection thresholds for temporal synchrony in auditory and auditory-visual sentence materials were obtained on normal-hearing subjects. For auditory conditions, thresholds were determined using an adaptive-tracking procedure to control the degree of temporal asynchrony of a narrow audio band of speech, both positive and negative in separate tracks, relative to three other narrow audio bands of...
متن کاملAudiovisual quality integration for interactive communications
This paper investigates multi-modal aspects of audiovisual quality assessment for interactive communication services. It shows how perceived auditory and visual qualities integrate to an overall audiovisual quality perception in different experimental contexts. Two audiovisual experiments are presented and provide experimental data for the present analysis. First, two experimental contexts are ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1996